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ABSTRACT 

Blind-source separation techniques are used to extract the transmission spectrum of the hot- 
Jupiter HD189733b recorded by the /f«66/e/NICMOS instrument. Such a 'blind' analysis of 
the data is based on the concept of independent component analysis. The de-trending of Hub- 
6Ze/NICMOS data using the sole assumption that nongaussian systematic noise is statistically 
independent from the desired light-curve signals is presented. By not assuming any prior, nor 
auxiliary information but the data themselves, it is shown that spectroscopic errors only about 
10 - 30% larger than parametric methods can be obtained for 11 spectral bins with bin sizes of 
~ 0.09/im. This represents a reasonable trade-off between a higher degree of objectivity for the 
non-parametric methods and smaller standard errors for the parametric de-trending. 

Results are discussed in the light of previous analyses published in the literature. The fact 
that three very different analysis techniques yield comparable spectra is a strong indication of 
the stability of these results. 

Subject headings: methods: data analysis — techniques: spectroscopic — planets and satellites: atmo- 
spheres — planets and satellites: individual(HD189733b) 



1. Introduction 

The held of exoplanetary spectroscopy is as 
rapidly advancing as it is new. It has come 
from the first detection of spectroscopic features 
in an exoplanetary atmosphere (Charbonneau et 
al. 2002), to an ever more detailed characterisa- 
tion of a variety of targets, (e.g. Agol et al. 2010; 
Beaulieu et al. 2010, 2011; Berta et al. 2012; 
Charbonneau et al. 2005, 2008; Deming et al. 
2005, 2007; Grillmair et al. 2008; Knutson et 
al. 2007a; Snellen et al. 2008, 2010a,b; Brogi 
et al. 2012; Bean et al. 2011; Stevenson et al. 



2010; Swain et al. 2008, 2009a,b; Tinetti et al. 
2007b, 2010; Crouzet et al. 2012). The aim to 
characterise smaller and smaller planets is equally 
a quest for higher and higher precision measure- 
ments, which are often limited by the systematic 
noise associated with the instrument with which 
the data are observed. This is particularly true for 
general, non-dedicated observatories. In the past, 
parametric models have extensively been used by 
several teams to remove instrument systematics 
(e.g. Agol et al. 2010; Beaulieu et al. 2010, 
2011; Brown et al. 2001; Burke et al. 2010; Char- 
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bonneau et al. 2005, 2008; Deming et al. 2007; 
Desert et al. 2011; Gibson et al. 2011; Grillmair 
ct al. 2008; Knutson et al. 2007a; Pont et al. 
2008; Sing et al. 2011; Swain et al. 2008,2009b). 
Parametric models approximate systematic noise 
via the use of auxiliary information of the instru- 
ment, the so called optical state vectors (OSVs). 
Such OSVs often include the X and Y-positional 
drifts of the star or the spectrum on the detector, 
the focus and the detector temperature changes, 
as well as positional angles of the telescope on the 
sky. By fitting a linear combination of OSVs to the 
data, the parametric approach derives its system- 
atic noise model. We refer to this as the 'linear, 
parametric' method. In the case of dedicated mis- 
sions, such as Kepler (Borucki ct al. 1996; Jenkins 
ct al. 2010), the instrument response functions 
are well characterised in advance and conceived 
to reach the required 10~ 4 to 10 -5 photometric 
precision. For general purpose instruments, not 
calibrated to reach this required precision, poorly 
sampled optical state vectors or a missing param- 
eterisation of the instrument often become critical 
issues. Even if the parameterisation is sufficient, it 
is often difficult to determine which combination 
of these OSVs may best capture the systematic 
effects of the instrument. 

Given the intricacies of a parametric approach, 
several groups have worked towards alternative 
methods to decorrelate the data from instrumen- 
tal and stellar noise. The issue of time-correlated 
systematics in exoplanetary time series was dis- 
cussed by Pont et al. (2006). Carter & Winn 
(2009) developed a wavelet based, non-parametric, 
de-trending of 1// noise contaminated lightcurves. 
Thatte et al. (2010) proposed a selective prin- 
cipal component filtering to reduce instrument 
and telluric systematics. More recently, Gibson 
ct al. (2012), here after G12, presented a non- 
parametric de-trending approach based on Gaus- 
sian Processes (GP; Rasmussen & Williams 2006). 
GP, as implemented by G12, belongs to the class 
of non-parametric, supervised machine learning al- 
gorithms. It uses the observed data and OSVs 
to derive an optimal, non-linear systematic noise 
model for the observed data. 

In Waldmann (2012), here after W12, we 
proposed independent component analysis (ICA; 
Hyvarinen 1999) as an effective way to de- 
correlate the exoplanetary signal from the instru- 



ment and stellar noise components. ICA belongs 
to the category of unsupervised machine learn- 
ing algorithms (i.e. the algorithm does not need 
to be trained prior to use). It does not require 
auxiliary information such as OSVs but only the 
observed data themselves. This approach is also 
known as blind dccorrclation as the assumptions 
on the system are minimal. As described later 
on, ICA assumes a linear combination of inde- 
pendent components (ICs) to form its systematic 
noise model. 

The issue of poorly constrained parameter 
spaces is in fact not new in astrophysics and has 
given rise to an increased interest in blind-source 
separation algorithms. Cosmological and extra- 
galactic observations, in particular, are often anal- 
ysed through fully blind, non-parametric methods 
and ICA has successfully been used to separate 
the cosmic microwave background (CMB) or the 
signatures of distant galaxies from their galactic 
foregrounds (e.g. Chapman et al. 2012; Stivoli 
ct al. 2006; Maino et al. 2002, 2007; Wang ct 
al. 2010). Aumont & Maci'as-Perez (2007), for 
instance, can separate the instrumental noise from 
the desired astrophysical signal by using ICA. 

In W12 the ICA approach was tested by pre- 
senting two single, de-correlated light-curves of 
the primary transit of HD189733b and XOlb ob- 
tained with Hubble /NICMOS in its spectroscopy 
setting. In this paper, we apply the ICA de- 
trending method to the extraction of an exoplan- 
etary spectrum; we then compare the results ob- 
tained to the previous ones published in the litera- 
ture and discuss the advantages and limits of this 
technique for analysing exoplanetary spectra and 
light-curves. 

1.1. The HD189733b Hubble/NICMOS data- 
set 

As explained in the introduction, the main fo- 
cus of our paper is the application and critical 
discussion of the ICA method to de-trend time- 
resolved spectroscopic data. We chose the transit 
spectrum of the hot-Jupiter HD189733b recorded 
by Hubble/NICMOS as a testbed, the main reason 
being that these data have been analysed by other 
teams in the past and the results of the analysis 
have been quite debated in the literature. 

The NICMOS data were first published by S08 
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and then re-analysed by Gibson et al. (2011) 
using a similar linear parametric de-trending ap- 
proach. Gibson et al. (2011), failing to retrieve 
the transmission spectrum reported by S08, at- 
tributed the discrepancy to a high degree of de- 
generacy between the systematic noise correction 
model used and the extracted spectrum. Swain 
et al. (2011) argued, though, this discrepancy 
being due to poorly derived OSVs by Gibson et 
al. (2011). A more recent re-analysis of the data 
by G12 using non-parametric, non-linear Gaussian 
Processes, found a solution consistent with that of 
S08, but error bars on average 2.4 times as large. 

Given these debates, it becomes critical to un- 
derstand how far we can push the analysis when no 
prior or auxiliary information for the instrument 
is assumed. 

2. Data analysis 

In this section we briefly introduce the ICA 
technique and we apply it to the primary tran- 
sit data of HD189733b, as recorded by Hub- 
&Ze/NICMOS. 

2.1. Independent Component Analysis 

Let us assume we have multiple, simultaneous 
observations of a mixture of signals. In time- 
resolved spectroscopy (spectrophotometry) we ob- 
tain a light-curve signal of the exoplanetary transit 
per resolution element of the spectrograph. Here, 
each measured time series is denoted by Xk(t), 
where k is its index in the individual time series 
and N is the total number of time series. The 
measured signal can be assumed to be the sum 
of the astrophysical light-curve signal, s a (t), the 
instrumental and stellar noise sources, s sn (t) and 
the white noise, s wn (t). So: 

Xk{t) = a k ,is a (t) + s wn (t) + 

(*)+ (i) 

+ a k . 3 s sn2 (t) + ... + a k:Nsn s sn (t) 

or as sum of vectors (the time-dependance has 
been dropped for clarity): 

Xk = a k ,is a + ak,2S wn + ^2 a k,l s sn (2) 
1=3 

where I is the estimated source signal index. For 
non-overcomplete sets, we have as many individ- 



ual source signals as time series, i.e. fc = I. 
Assuming only one source signal is astrophysi- 
cal and one is Gaussian per time series, we can 
state the total number of source signals to be 
N = N sn + 2, where N sn is the number of 
systematic noise source signals. 

It is worth noting that for overcomplete sets, we 
have more time series available than source signals 
contained within, i.e. k > I. In these cases we 
can reduce the dimensionality of the data set (for 
example using principal component analysis) or 
select a sub-set of estimated source signals given 
some selection criteria (W12). 

We can also express eq. 1 in matrix form as: 

x = As (3) 

where x is the column vector containing the mea- 
sured time series, x k , i.e. x = (x\, x 2 , ■ ■ ■ , xn) T , 
s is the column vector of independent source sig- 
nals, s = (s a ,s wn ,s sn i,s sn2l ■ ■ ■ ,s N ) T . We may 
also write s = s a + s wn + s sn to clearly differen- 
tiate between the astrophysical, white noise and 
systematic components. A is the N x N dimen- 
sional 'mixing matrix' comprised of the weights 

The motivation of ICA is to estimate A without 
prior knowledge of A or s (Hyvarinen 1999). This 
is achieved by making the stringent assumption 
that the source signals composing s, s k , are statis- 
tically independent from each other. Such an as- 
sumption is valid, as one expects the astrophysical 
light-curve signal to be independent in origin from 
systematic instrumental noise. ICA algorithms 
hence attempt to de-compose observed signals, x, 
to a set of independent source components, s. To 
maximise said independence, several approaches 
have been proposed (for a comprehensive sum- 
mary: Hyvarinen & Oja. 2001; Comon & Jutten 
2000) . Here we follow W12 and use a variant of the 
FastICA algorithm, which maximises statistical 
independence of the estimated source signals s; by 
maximising the nongaussianities of their respec- 
tive probability distributions (Hyvarinen 1999; 
Hyvarinen et al. 1999; Hyvarinen & Oja 2000; 
Koldovsky et al. 2006; Hyvarinen & Oja. 2001; 
Comon & Jutten 2000). 

A linear combination of individual source sig- 
nals, as in eqs. 1 and 2, is hence a valid assump- 
tion as each component is assumed to be fully 
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independent. However, a few subtleties regard- 
ing this approach are worth mentioning. Only 
the signals that are common to all time series, 
Xk(t), can realistically be de-convolved. Taking 
Hubble/NICMOS as example, systematics intro- 
duced by the grism are easily detectable, as these 
are 'common' to all dispersed wavelengths. Sim- 
ilarly, detector-wide flat-fielding gradients are de- 
pendable. On the other hand, localised inter-pixel 
variations are not represented in all time series and 
would not be de-correlated with ICA. These prop- 
erties lead to the discussion of 'global' and 'local' 
noise models later on in the text. Whilst the ef- 
fect of limb-darkening is reduced in the infra-red, 
it is true that wavelength varying limb-darkening 
coefficients can impair the direct extraction of the 
lightcurve feature. However, as described in the 
next section, the systematic noise model makes no 
direct use of the astrophysical component, s a , and 
hence circumvents this potential limitation. 

2.2. Application to HD189733b 

The primary transit of HD189733b was ob- 
served using ffST/NICMOS in the G206 grism 
setting and spanning five consecutive orbits. The 
selected grism covers the spectral range of ^1.51 
- 2.43 fj,m, see S08. The #ST-pipeline calibrated 
data were downloaded from the MAST 1 archive 
and the spectrum was extracted using both stan- 
dard IRAF 2 routines as well as a custom built rou- 
tine for optimal spectral extraction. Both extrac- 
tions are in good accord with each other but the 
custom built routine was found to yield a better 
signal to noise and was subsequently adopted for 
all further analysis. In order to minimise inter- 
and intra-pixel variability of the NICMOS detec- 
tor, the instrument was slightly de-focused to a 
full- width-half-maximum (FWHM) of ^5 spectral 
channels per resolution clement. This sets a limit 
on the maximum resolution, R, achievable. 

ICA is limited by the amount of Gaussian noise 
present in the data (Hyvarinen & Oja 2000). We 
found the minimal binning of 5 channels to be 
too low in SNR and used a binning of 8 spectral 
channels (~ 0.09^im). Several data pre-processing 
methods exist to decrease the Gaussian compo- 
nent of time series data (e.g. kernel smoothing, 



1 http://archive.stsci.edu/ 
2 http://iraf.noao.edu/ 



low pass filters, wavelet based approaches, etc.) 
but we decided to interfere as little as possible with 
the original data and opted for a slightly coarser 
binning instead. This resulted in 11 light curves 
across the G206 grism band. We found the first 
of the 5 orbits to be very noisy and negatively 
impacting the efficiency of the algorithm and ex- 
cluded the first orbit from all further analysis. An 
example of the 'raw' light-curves' quality, at ^2.33 
/zm, can be found in figure 1. 

As described in W12, we used the extracted 
light-curves as input to the ICA algorithm to cal- 
culate the mixing matrix, A, and its (pseudo)inverse, 
the de-mixing matrix W = A -1 . Once the de- 
mixing matrix had been determined, the algo- 
rithm tested the estimated components for their 
nongaussianity and returned four main system- 
atic noise components which do not correlate with 
the expected light-curve morphology. These com- 
ponents, comprising s sn , were extracted over the 
entire spectral range of the grism (referred to as 
'global' below) and showed a good degree of sep- 
aration (figure 2, left side). ICA estimates the 
mixing matrix A up to a sign and scaling factor, 
meaning the source signals, s, are recovered but 
lack an overall scaling constant. In an analogy to 
principal component analysis, we can think of this 
as recovering the eigenvectors but not the eigen- 
values. Due to this ambiguity, we need to deter- 
mine the scaling of the individual systematic noise 
components, per time series Xk, separately. This is 
done by fitting a systematic-noise-model (SNM), 
to/j, to the out-of-transit data of each time series 
Xk, where nik is the sum of all scaled systematic 
noise components in s sn , i.e. = X)/I=T °k.i s sn.i, 
where Okj is the scaling factor for the systematic 
noise signal s sn j for a given time series k. A 
Nelder-Mead minimisation algorithm (Press et al. 
2007) was used to fit for Ok,i- The scaling am- 
plitude of each component for each light-curve is 
given in figure 2 (right side). Once nik is deter- 
mined, we subtract it from the raw data to get the 
corrected time series yk — Xk — mk, see figure 1 
for an example. 

2.3. Lightcurve fitting and Error-bars 

Having obtained the de-trended time series, yk, 
we proceed to model-fit these using the analytical 
model by Mandel & Agol (2002), from here MA02, 
with orbital and limb-darkening parameters taken 
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Fig. 1. — Raw light-curve at ^2.33 /on (black 
crosses), its respective systematic noise model (red 
squares), mk(t), composed out of the systematic 
components in figure 2. The de-trended final light- 
curve is shown underneath (blue circles) with a 
Mandel & Agol (2002) fit overlaid. 
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from S08. The S08 quadratic limb-darkening pa- 
rameter are interpolated to the coarser wavelength 
grid of this analysis. The transit depth, 6 k , is left 
as only free parameter. The transit depths of all 11 
light-curves constitute the exoplanetary spectrum. 
The model-fitting was performed using a Markov 
Chain Monte Carlo (MCMC) algorithm and cross 
checked using two variants of a Bootstrap Monte 
Carlo analysis (see appendix A). 

2.3.1. Markov Chain Monte Carlo 

MCMC (Press et al. 2007) has become the 
standard fitting routine for exoplanetary time se- 
ries and radial velocity data (e.g. Ford 2006; 
Burke et al. 2007; Bakos et al. 2007; Knutson 
et al. 2007a; Cameron et al. 2007; Charbonneau 
et al. 2009; Bean et al. 2011; Kipping & Bakos 
2011; Gregory 2011; Crouzet et al. 2012; Knutson 
et al. 2012). In this analysis we use an adaptive 
version of the standard Metropolis-Hastings algo- 
rithm (Haario et al. 2001, 2006; Metropolis et 
al. 1953; Hastings 1970; Press et al. 2007). The 
only free parameter, 6 k , was set to have a uniform 
prior ranging from Rp/R* = - 1. As we only 
fit for the transit depth we are not concerned by 
inter-parameter correlations in this analysis. 

We perform an initial MA02 model fit, rriMA02,k(t), 
using a Nelder-Mead minimisation algorithm 
(Press et al. 2007) and calculate the model sub- 
tracted residual, rfc(t) = Xk(t) — niMA02,k(t)- We 
take the fitted transit depth as starting value of 
the MCMC chain and calculate the variance of the 
normal sampling distribution as 



Fig. 2. — LEFT: Four retrieved nongaussian sys- 
tematic noise components in the order of impor- 
tance. They were computed over the whole spec- 
tral range of the G206 grism and describe the sys- 
tematic noise (instrumental and/or stellar) com- 
mon to all spectral channels. RIGHT: Scaling fac- 
tors, Ok.i, of the systematic noise components on 
the left. The colour coding is identical for both 
plots. We can see that the first component at 
2.06 /xm is sharply deviating from its own pass- 
band mean and the mean of all components at 
2.06 /im. This can indicate that the 'global' sys- 
tematic noise model does not describe well the sys- 
tematics in this channel and could over-correct. 



^sample = var(r k (t)) + 2 cov[r k (t), r k (t + r)] 

T = l 

(4) 

where var is the variance of the residual and cov 
the auto-covariance for a given lag r. This ac- 
counts for remaining autocorrelated noise in the 
time series data. The MCMC algorithm was con- 
sequently run for 2 x 10 4 iterations to guarantee 
a good coverage of the posterior distribution, of 
which examples are shown in figures 6a & 7a. Here 
the error bars are the standard deviation of the 
posterior distribution. 

In addition to the MCMC algorithm described 
here, we also estimated the standard error us- 
ing two variants of a Bootstrap Monte-Carlo al- 



■5 



gorithm, see appendix A. Wc find the retrieved 
transit-depths and errors to be in good agreement 
with the MCMC method. 

2.3.2. Source signal separation error 

The two core algorithms used in this analysis 
EFICA (Koldovsky et al. 2006) and WASOBI 
(Yeredor 2000; Tichavsky et al. 2006b) can be 
shown to be asymptotically efficient, i.e. reaches 
the Cramer-Rao Lower Bound (CRLB) in an ideal 
case where the nonlinearity G(.) equals the signal's 
score function. In other words, the algorithms em- 
ployed here can be shown to converge to the cor- 
rect solution given the original source signals and 
in the limit of Ni ter — > oo iterations. 

In reality the number of iterations is finite and 
imperfect convergence results in traces of other 
sources to remain in the individual signals com- 
prising s. We can hence state that the estimated 
de-mixing matrix, W is only approximately equal 
to the inverse of the original mixing matrix, A, i.e. 

W ~ A" 1 (5) 

This requires us to calculate the signal separation 
error (SSE) of the analysis. A measure of this error 
is the deviation of WA from the unity matrix by 
inspecting the variance of its elements (Koldovsky 
et al. 2006; Hyvarinen & Oja. 2001). 

To assert a good degree of separation, we can 
define G as the gain matrix. For a perfectly es- 
timated de-mixing matrix, W, the gain matrix is 
equal to its identity matrix 

G = WA = I (6) 

In signal processing, the performance of blind- 
source separation algorithms is usually measured 
by the signal over inference ratio, SIR 3 . The SIR is 
the standard measure in signal processing of how 
well a given signal has been transmitted or de- 
convolved from a mixture of signals. Given the 
inference to be a noise source, we can equate this 
to the more commonly used signal-to-noise ratio 
(SNR). We can now calculate the separation error 
of the estimated source signal, si, in relation to 
the original source signal, Sk, using 



standard literature also refers to its inverse, the interference 
over signal ratio, ISR 



°l = ^f-j , k, I = 1, 2, N. (7) 

However, the original mixing matrix, A, and 
the original source signals, are not generally 
known for real data sets and equation 7 is only 
useful in the case of simulations. Tichavsky et al. 
(2006a); Koldovsky et al. (2006) and Tichavsky 
et al. (2008) have shown that despite the origi- 
nal mixing matrix being unknown, an asymptotic 
estimate of the SIR can be made. A derivation 
of this process is beyond the scope of this paper 
and we refer the interested reader to the relevant 
literature. 

We now have the retrieval error on the individ- 
ual source components, cr; . In order to obtain the 
overall error on the systematic noise model, m,k, 
calculated in section 2.2, we compute the weighted 
sum of the individual source separation errors with 
the previously retrieved source component weight- 
ing factors, Ofc ; / 

(Nsn \ 1 / JV «» 

°k,l<7l\ ( 8 ) 

2.3.3. SNM fitting error 

In addition to the above determined errors, wc 
also include an ICA fitting error which accounts 
for possible over-corrections of the global SNM 
to individual, poorer constraint light-curves. This 
term becomes non-zero when the scaling of a sys- 
tematic noise component, Ok,i (figure 2, right hand 
side), shows a 3c significant deviation from the 
mean scaling of all other light-curves, cV In other 
words, we expect the scaling of an individual sys- 
tematic noise component, s sn , to be a slowly vary- 
ing function over wavelength for 'globally' esti- 
mated systematic-noise components. If individual 
light-curves show a significantly larger positive or 
negative scaling than expected, for an individual 
lightcurve, we can assume the nongausian noise 
of the affected light-curve not to be properly ac- 
counted for by this global model. In larger data- 
sets it is easier to exclude the affected light-curve 
from any further analysis, whilst in small data-sets 
we take the amplitude of the scaling from its mean 
scaling as the error, i.e. 
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(A) Systematic noise components 



0~k,SNM 



(Ok J ~ O k \ 



(9) 



In this analysis we find the ICA fitting error to 
be zero for all wavelengths but the 2.06/im spectral 
point. 



2.3.. 



Final Error Bar 



In summary the final error bar per time series, 
k, consists of: 

1. Standard error: Estimating the variance 
in retrieved transit depth when model fitting 
of the de-trended lightcurves. 

2. Signal separation error: Estimating the 
ICA component separation error. 

3. Systematic noise model error: Esti- 
mating errors due to noise model over- or 
under-fitting for individual time series. 

We now define the final error to be the sum of 
squares of the above mentioned error sources: 



0~k,TOTAL 



Results 



°~k,MCMC + a l,SSE 



k,SNM 



(10) 



Figure 1 shows the raw light-curve at ^2.33 /im 
in black crosses with its corrected counterpart 
(blue circles) offset below. Here, much of the au- 
toregressive noise in the original data could be 
captured by the noise model (red squares) and re- 
moved from the final result. All raw and corrected 
light-curves are shown in figure 3 with their re- 
spective systematic noise models. The resulting 
spectrum is presented in figure 4 and table 1. We 
find the retrieved spectrum to be in good agree- 
ment with the 'parametric' analysis S08 and G12 
in terms of spectral shape which show-cases the 
robustness of this methodology and the stability 
of the result as a whole. 

The underlying noise of the spectral point at 
~2.06 /xm was flagged by the algorithm to be dis- 
crepant with the global systematic noise model. 
This can also be seen by the higher systematic 
noise remaining in the corrected lightcurve (5 th 
from the top in figure 3) . Here the first systematic 
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Fig. 5. — Comparision between ICA derived non- 
gaussian components on the left with convention- 
ally derived OSVs as found by Swain et al. (2008). 



noise component is indicative of an overcorrection 
which is reflected in the error-bar as described in 
the previous section. Overall, the error-bars re- 
ported here are ~ 10 - 30% larger than those re- 
ported by S08 but ~ 10 - 50% smaller than those 
reported by G12. It should be noted that this 
analysis uses a slightly coarser bin size yielding 1 1 
data points whilst the S08 and G12 analyses yield 
18 spectral points. 

4. Discussion 

In this analysis we have computed the global 
SNM and found a good agreement with previously 
published results. Figure 4 shows the compari- 
son between the spectrum derived in this analy- 
sis, compared to the linear, fully parametric ap- 
proach by S08 and the non-linear, non-parametric, 
Gaussian Processes, approach by G12. Both non- 
parametric approaches yield slightly higher error 
bars than the parametric approach. We find the 
error on the ICA derived spectrum to be ~10 - 
30% bigger than those reported by S08 using a 
coarser bin size of ~ 0.09/im. These differences 
in error bars between the 'blind' and 'informed' 
(meaning the use of auxiliary information of the 
instrument) approaches are not surprising. By 
not assuming any knowledge of the data or in- 
strument, we are actively neglecting auxiliary in- 
formation helpful to the de-trending of the data 
set. 

With the linear, non-parametric blind ICA 
method presented here, the uncertainties grow by 
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Fig. 3.— Left: Raw light-curves from 1.51/im (bottom) to 2.43/zm (top) with fitted Mandel & Agol (2002) 
model overlaid. The lightcurves were off-set for clarity. Middle: Systematic noise model for each raw 
lightcurve in the left panel. Right: Final de-trended light-curves with fitted Mandel & Agol (2002) model 
overlaid. 
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Table 1 

nicmos transmission spectrum of hd189733b for a 'global' systematic noise model 
correction and plotted in figure 4. the columns are wavelength (a), planet-star- ratio 

(Rp/R s ) AND THE RESPECTIVE ERROR-BAR (a(R p /R s )). 



A (/im) 


Rp/Rs 


a(R p /R s ) x 10~ 4 


2.429 


0.155187 


3.69383 


2.336 


0.155125 


3.03754 


2.244 


0.155437 


2.83050 


2.152 


0.155000 


3.01554 


2.060 


0.155200 


12.29019 


1.968 


0.155187 


2.94831 


1.876 


0.155375 


3.09120 


1.784 


0.155187 


3.05973 


1.691 


0.154250 


2.54556 


1.599 


0.154000 


3.75548 


1.507 


0.156500 


3.28963 



up to 30 <~ 40% compared to the linear, parametric 
analysis by S08 (accounting for the larger bin sizes 
in this analysis) and a further ~ 70% when further 
relaxing the linear assumptions as for the non- 
linear, non-parametric Gaussian Processes (G12). 
In other words, we are trading smaller error-bars 
for a higher degree of objectivity. 

In figure 5 we show a comparison of the ICA 
derived nongaussian systematic noise components 
and parametric OSVs. S08 identified the X and 
Y-positional drifts on the detector to constitute 
the most important OSVs in their de-correlation 
whilst other OSVs are less significant. Similar 
independent components are also present in this 
analysis, whilst the other independent components 
differ more significantly from the parametric ap- 
proach. The agreement between the results of 
this analysis and those available in the literature, 
showcase the stability of this exoplanetary spec- 
trum. 

We find the global SNM approach to be most 
sensitive to slowly varying systematic trends 
across the data-set while local nongaussian de- 
viations tend not to be captured. It is therefore 
possible to generate, additionally, a 'local' SNM 
for a sub-set of spectral bins or before binning 
on the individual 'raw' spectral channels, should 
the SNR permit it. It is important to remember 
that k > N sn + 1 as the input to the algorithm. 



In other words, at least as many observed time 
series, Xk, are required as input to the algorithm 
than total number of nongaussian components in 
the data. In this case, the minimum 'local' SNM 
would include 5 spectral bins (4 systematic noise 
and 1 astrophysical component). With 11 spectral 
bins in total, the NICMOS data-set is too small 
for this approach. However, for larger sets, this 
two stage 'global' + 'local' detrending becomes 
a viable solution. Furthermore, multiple obser- 
vations of a transit /eclipse event with the same 
instrumental setup can be helpful to further de- 
correlate the observed data. We expect in fact 
the astrophysical signal being stable throughout 
consecutive transits, while the instrumental noise 
being largely uncorrelated between observations 
(Waldmann et al. 2012). 

5. Conclusion 

Here we present a reanalysis of a HD189733b 
primary eclipse spectrum from <~1.51 - 2.43 /jm 
obtained with Hubble/ NICMOS. This analysis dif- 
fers from previous publications in that it uses blind 
machine-learning to de-trend data with no prior 
or auxiliary information about the data or in- 
strument. Such blind-source de-convolution algo- 
rithms can be used alone or in conjunction with 
other de-correlation techniques, making them very 
powerful tools in the analysis of exoplanetary data. 
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This is especially true for instruments that lack an 
a priori calibration plan at the level of 1CT 4 pho- 
tometric precision, as needed for this field. 

We compare our results with previously pub- 
lished analyses of the same data set: another non- 
parametric, non-linear approach (G12) and the 
classical linear parametric method (Swain et al. 
2008). We find that the error-bars of this analy- 
sis are ~ 10 - 30% larger than those reported by 
Swain et al. (2008). We attribute this difference 
to the higher amount of auxiliary information in- 
jected in the parametric approach. Ultimately, it 
is a trade-off between a higher degree of objectiv- 
ity for the non-parametric methods and smaller 
errors for the parametric de-trending. Additional 
observations would have allowed much smaller er- 
ror bars and a more robust determination of the 
signal at ^2.06 /urn. 

The fact that three very different analysis tech- 
niques yield comparable spectra is a strong indica- 
tion of the stability of these results. The error bars 
estimated in this paper through ICA are smaller 
than the ones of Gibson et al. (2012) through 
Gaussian Processes, suggesting more investigation 
is needed to identify the most effective techniques 
to de-trend exoplanet atmosphere data and, most 
importantly, understand their limitations. 
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ties for Research in Astronomy, Inc., under NASA 
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A. Bootstrap Monte Carlo 

Following from section 2.3.1, an alternative way of estimate a parameter's sampling distribution is to 
use so called 'bootstrap' or 'jacknife' Monte-Carlo methods (Marcy et al. 2005; Press et al. 2007; Baluev 
2009, 2012). These methods estimate a parameter's distribution by replacing parts of the original data with a 
randomly permeated version of the data to observe the effect on a consequent model fit. For a given iteration 
of the algorithm we model-ht the time series and randomly scramble the model-subtracted residual. We then 
add the scrambled residual back to the original model-fit and repeat the process. Please see Appendix A.l 
for details of the iteration scheme. As for the MCMC algorithm, we ran the bootstrap Monte-Carlo for 
2 x 10 4 iterations and examples of the parameter distributions are found in figures 6b & 7b. It is worth 
noting that scrambling process destroys any autocorrelation in the data as well as homogenises nongaussian 
systematics in the data. 

An approach to directly measure the effects of autocorrelation in the data, the 'prayer-beed' algorithm 
has been suggested (Jenkins et al. 2002; Southworth 2008). Whereas bootstrap methods randomly replace 
parts of the original data, the 'prayer-beed' algorithm shifts the model subtracted residual along the time- 
axis for every iteration. Given the small number of the data points available in this analysis and the 
duration of the systematics being on the same or similar time scales to the transit signal, we refrain from 
using this method. Instead, we re-run the above bootstrap Monte-Carlo method with the modification of 
only replacing a randomly sized fraction, ranging from 40-100%, of the data. This modification preserves 
parts of the autocorrelation whilst the high iteration number of 2 x 10 4 ensures a sufficient sampling. The 
iteration scheme is described in Appendix A. 2 and figures 6c & 7c are examples of the transit-depth sampling 
distributions. 

A.l. Method 1: 

1. Set y c ~yk, where c is the bootstrap iteration index. 

2. Using a Nelder-Mead minimisation and the MA02 model, evaluate and record the best fit transit-depth, 

S c . 

3. Compute the model subtracted residual, r c — y c — mMA02($c), where rriMA02(8c) is the MA02 model 
with the fitted transit depth. 

4. Randomly scramble the residual to obtain r c _ permuta ted- 

5. And add the scrambled residual back on the model to obtain the new time series y c+ \ = mMAmi&c) + 

' Cjpermutated 

6. Steps 2-5 are repeated N boot times. 
A.2. Method 2: 

In the second method, we follow the procedural sequence of Method 1 but only 

1. Set y c ~yk, where c is the bootstrap iteration index. 

2. Using a Nelder-Mead minimisation and the MA02 model, we evaluate and record the best fit transit- 
depth, S c . 

3. Compute the model subtracted residual, r c = y c — m M A02{S c ), where m M A 02 {5c) is the MA02 model 
with the fitted transit depth. 

4. Randomly scramble the residual to obtain r c _ permutated . 
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Fig. 6. — Posterior distribution of fitted transit depth at 2.4291/im using: A) MCMC algorithm; B) Bootstrap 
Monte-Carlo, method 1; C) Bootstrap Monte-Carlo, method2. The sample median is marked with red, 
continuous line, the la error bars are marked with red-discontinuous lines. D) cumulative distribution 
functions of sampling distributions in plots A, B & C. 
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Fig. 7. — Posterior distribution of fitted transit depth at 1.9683/zm using: A) MCMC algorithm; B) Bootstrap 
Monte-Carlo, method 1; C) Bootstrap Monte-Carlo, method2. The sample median is marked with red, 
continuous line, the la error bars are marked with red-discontinuous lines. D) cumulative distribution 
functions of sampling distributions in plots A, B & C. 
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5. Randomly replace a fraction of the original residual, r c , with the permutated residual, r c ^ permuta ted- 
This fraction is chosen at random but held to be within 40 - 100% of the original data. We call this 
semi-permutated residual r c ^ semi . 

6. Add the above residual back on the model to obtain the new time series y c+ i = m M A02{S c ) + r c,semi 

7. Steps 2-6 are repeated N boot times. 
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